Overview

Brought to you by YData

Dataset statistics

                                 Dataset A    Dataset B
Number of variables              12           12
Number of observations           446          446
Missing cells                    453          437
Missing cells (%)                8.5%         8.2%
Duplicate rows                   0            0
Duplicate rows (%)               0.0%         0.0%
Total size in memory             45.3 KiB     45.3 KiB
Average record size in memory    104.0 B      104.0 B
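
The percentage and per-record figures in this table can be re-derived from the raw counts. The sketch below assumes that "Missing cells (%)" is the share of empty cells among all variables × observations cells and that the average record size is the total size divided by the number of rows; both assumptions reproduce the reported values. The function names are illustrative only.

```python
# Figures copied from the Dataset statistics table.
N_VARIABLES = 12
N_OBSERVATIONS = 446
MISSING_CELLS = {"Dataset A": 453, "Dataset B": 437}
TOTAL_SIZE_KIB = 45.3


def missing_cells_pct(missing: int, n_vars: int, n_obs: int) -> float:
    """Share of empty cells among all n_vars * n_obs cells, in percent."""
    return 100.0 * missing / (n_vars * n_obs)


def average_record_size_bytes(total_kib: float, n_obs: int) -> float:
    """Total in-memory size spread evenly across the rows, in bytes."""
    return total_kib * 1024 / n_obs


for name, missing in MISSING_CELLS.items():
    pct = missing_cells_pct(missing, N_VARIABLES, N_OBSERVATIONS)
    print(f"{name}: {pct:.1f}% missing cells")

size = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{size:.1f} B per record")
```

The 453 missing cells of Dataset A are the 102 missing Age values plus the 351 missing Cabin values listed under Alerts.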

Variable types

               Dataset A    Dataset B
Numeric        4            5
Categorical    5            4
Text           3            3

Alerts

Alert type          Dataset A                                         Dataset B
High correlation    Sex is highly overall correlated with Survived    Sex is highly overall correlated with Survived
High correlation    Survived is highly overall correlated with Sex    Survived is highly overall correlated with Sex
Imbalance           Parch is highly imbalanced (53.6%)                Alert not present in this dataset
Missing             Age has 102 (22.9%) missing values                Age has 85 (19.1%) missing values
Missing             Cabin has 351 (78.7%) missing values              Cabin has 351 (78.7%) missing values
Unique              PassengerId has unique values                     PassengerId has unique values
Unique              Name has unique values                            Name has unique values
Zeros               SibSp has 312 (70.0%) zeros                       SibSp has 315 (70.6%) zeros
Zeros               Fare has 7 (1.6%) zeros                           Fare has 9 (2.0%) zeros
Zeros               Alert not present in this dataset                 Parch has 341 (76.5%) zeros
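
The 53.6% in the Imbalance alert is consistent with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below is an assumption about how the figure is obtained, not a copy of the profiler's code; it uses the Parch counts for Dataset A that appear in the Parch section and gives a score of about 0.536.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """Return 1 - H / log2(k): 0 for a uniform split, 1 when one class holds every row."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Parch value counts for Dataset A (values 0, 1, 2, 4, 3), as listed in the Parch section.
parch_a = [339, 58, 46, 2, 1]
print(f"{imbalance_score(parch_a):.3f}")
```

A perfectly balanced variable scores 0, so a higher percentage means that fewer classes carry most of the rows.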

Reproduction

                          Dataset A                     Dataset B
Analysis started          2025-03-24 21:58:56.250521    2025-03-24 21:58:57.864474
Analysis finished         2025-03-24 21:58:57.861354    2025-03-24 21:59:00.093510
Duration                  1.61 seconds                  2.23 seconds
Software version          ydata-profiling v0.0.dev0     ydata-profiling v0.0.dev0
Download configuration    config.json                   config.json

Variables

PassengerId
Real number (ℝ)

                Dataset A    Dataset B
Distinct        446          446
Distinct (%)    100.0%       100.0%
Missing         0            0
Missing (%)     0.0%         0.0%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            444.5852     427.79372
Minimum         3            2
Maximum         891          891
Zeros           0            0
Zeros (%)       0.0%         0.0%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Quantile statistics

                             Dataset A    Dataset B
Minimum                      3            2
5-th percentile              50.25        48.25
Q1                           222.5        213.75
median                       448.5        413.5
Q3                           654.75       640.75
95-th percentile             849.75       841.75
Maximum                      891          891
Range                        888          889
Interquartile range (IQR)    432.25       427
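
Range and interquartile range are derived from the other rows of this table: Range is Maximum minus Minimum, and IQR is Q3 minus Q1. A minimal sketch using the PassengerId quantiles reported above (the `Quantiles` class is illustrative, not part of any library):

```python
from dataclasses import dataclass


@dataclass
class Quantiles:
    minimum: float
    q1: float
    q3: float
    maximum: float

    @property
    def value_range(self) -> float:
        """Spread between the largest and smallest value."""
        return self.maximum - self.minimum

    @property
    def iqr(self) -> float:
        """Spread of the middle half of the data."""
        return self.q3 - self.q1


passenger_id = {
    "Dataset A": Quantiles(minimum=3, q1=222.5, q3=654.75, maximum=891),
    "Dataset B": Quantiles(minimum=2, q1=213.75, q3=640.75, maximum=891),
}

for name, q in passenger_id.items():
    print(name, q.value_range, q.iqr)
```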

Descriptive statistics

                                   Dataset A       Dataset B
Standard deviation                 255.16398       253.23946
Coefficient of variation (CV)      0.57393719      0.59196628
Kurtosis                           -1.1691995      -1.1508941
Mean                               444.5852        427.79372
Median Absolute Deviation (MAD)    217             215.5
Skewness                           0.025472839     0.093424544
Sum                                198285          190796
Variance                           65108.657       64130.223
Monotonicity                       Not monotonic   Not monotonic
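
Two rows of this table follow directly from others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks both relations against the PassengerId figures reported here; small differences in the last digits come from rounding in the report.

```python
import math

# (standard deviation, mean, variance) for PassengerId, copied from the table.
reported = {
    "Dataset A": (255.16398, 444.5852, 65108.657),
    "Dataset B": (253.23946, 427.79372, 64130.223),
}

for name, (std, mean, variance) in reported.items():
    cv = std / mean
    print(f"{name}: CV = {cv:.4f}, sqrt(variance) = {math.sqrt(variance):.3f}")
```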
Histogram with fixed size bins (bins=50)
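
A histogram with fixed size bins splits the interval from the smallest to the largest value into equally wide bins and counts the values in each. The sketch below is a plain-Python illustration of that idea, not the plotting code behind the report; it is exercised on the ten smallest PassengerId values listed for Dataset A, with 5 bins instead of 50 to keep the output short.

```python
def fixed_width_histogram(values: list[float], bins: int) -> list[int]:
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so use the first bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the right edge, so clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


smallest_ids = [3, 4, 7, 8, 9, 12, 13, 18, 21, 22]
print(fixed_width_histogram(smallest_ids, bins=5))
```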
Common values (Dataset A)

Value                 Count    Frequency (%)
480                   1        0.2%
857                   1        0.2%
22                    1        0.2%
726                   1        0.2%
509                   1        0.2%
532                   1        0.2%
855                   1        0.2%
477                   1        0.2%
210                   1        0.2%
264                   1        0.2%
Other values (436)    436      97.8%

Common values (Dataset B)

Value                 Count    Frequency (%)
195                   1        0.2%
358                   1        0.2%
740                   1        0.2%
184                   1        0.2%
312                   1        0.2%
381                   1        0.2%
145                   1        0.2%
569                   1        0.2%
368                   1        0.2%
136                   1        0.2%
Other values (436)    436      97.8%

Ten smallest values (Dataset A)

Value    Count    Frequency (%)
3        1        0.2%
4        1        0.2%
7        1        0.2%
8        1        0.2%
9        1        0.2%
12       1        0.2%
13       1        0.2%
18       1        0.2%
21       1        0.2%
22       1        0.2%

Ten smallest values (Dataset B)

Value    Count    Frequency (%)
2        1        0.2%
3        1        0.2%
4        1        0.2%
5        1        0.2%
6        1        0.2%
9        1        0.2%
13       1        0.2%
14       1        0.2%
19       1        0.2%
20       1        0.2%

Survived
Categorical

                Dataset A    Dataset B
Distinct        2            2
Distinct (%)    0.4%         0.4%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value    Dataset A    Dataset B
0        281          282
1        165          164

Length

                 Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

                       Dataset A    Dataset B
Total characters       446          446
Distinct characters    2            2
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
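
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point (scripts and blocks are not available in the standard library). The sketch below is independent of how the report itself derives its Unicode statistics, which list the category as "(unknown)"; the sample string is a name that appears in the Sample table of the Name variable.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Tally the Unicode general category (e.g. 'Lu', 'Ll', 'Nd') of each character."""
    return Counter(unicodedata.category(ch) for ch in text)


# Survived holds only the digits 0 and 1, which fall in category 'Nd' (decimal number).
print(category_counts("0101"))

# A free-text value mixes several categories: letters, punctuation and spaces.
sample = "Funk, Miss. Annie Clemmer"
for category, count in category_counts(sample).most_common():
    print(category, count)
```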

Unique

              Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

           Dataset A    Dataset B
1st row    1            0
2nd row    1            0
3rd row    0            1
4th row    0            1
5th row    0            1

Common Values

Value    Dataset A      Dataset B
0        281 (63.0%)    282 (63.2%)
1        165 (37.0%)    164 (36.8%)

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common-value plots for Dataset A and Dataset B; the counts shown match the Common Values table.]

Most occurring characters

Character    Dataset A      Dataset B
0            281 (63.0%)    282 (63.2%)
1            165 (37.0%)    164 (36.8%)

Most occurring categories, scripts and blocks

In both datasets all 446 characters (100.0%) fall under a single category, a single script and a single block, each reported as (unknown). The character frequencies per category, per script and per block are therefore identical to the Most occurring characters table.

Pclass
Categorical

                Dataset A    Dataset B
Distinct        3            3
Distinct (%)    0.7%         0.7%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value    Dataset A    Dataset B
3        248          245
1        108          104
2        90           97

Length

                 Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

                       Dataset A    Dataset B
Total characters       446          446
Distinct characters    3            3
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

           Dataset A    Dataset B
1st row    1            2
2nd row    2            3
3rd row    3            2
4th row    3            1
5th row    3            1

Common Values

Value    Dataset A      Dataset B
3        248 (55.6%)    245 (54.9%)
1        108 (24.2%)    104 (23.3%)
2        90 (20.2%)     97 (21.7%)

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common-value plots for Dataset A and Dataset B; the counts shown match the Common Values table.]

Most occurring characters

Character    Dataset A      Dataset B
3            248 (55.6%)    245 (54.9%)
1            108 (24.2%)    104 (23.3%)
2            90 (20.2%)     97 (21.7%)

Most occurring categories, scripts and blocks

In both datasets all 446 characters (100.0%) fall under a single category, a single script and a single block, each reported as (unknown). The character frequencies per category, per script and per block are therefore identical to the Most occurring characters table.

Name
Text

                Dataset A    Dataset B
Distinct        446          446
Distinct (%)    100.0%       100.0%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Length

                 Dataset A    Dataset B
Max length       56           67
Median length    47           49
Mean length      26.367713    26.556054
Min length       13           12

Characters and Unicode

                       Dataset A    Dataset B
Total characters       11760        11844
Distinct characters    58           59
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Dataset A    Dataset B
Unique        446          446
Unique (%)    100.0%       100.0%

Sample

           Dataset A                                     Dataset B
1st row    Wick, Mrs. George Dennick (Mary Hitchcock)    Funk, Miss. Annie Clemmer
2nd row    Beesley, Mr. Lawrence                         Nankoff, Mr. Minko
3rd row    Oreskovic, Mr. Luka                           Becker, Master. Richard F
4th row    Olsen, Mr. Henry Margido                      Ryerson, Miss. Emily Borie
5th row    Toufik, Mr. Nakli                             Bidois, Miss. Rosalie
Word frequencies (Dataset A)

Value                 Count    Frequency (%)
mr                    266      15.0%
miss                  95       5.3%
mrs                   47       2.6%
william               30       1.7%
master                24       1.3%
henry                 21       1.2%
john                  19       1.1%
james                 14       0.8%
joseph                11       0.6%
thomas                11       0.6%
Other values (875)    1241     69.8%

Word frequencies (Dataset B)

Value                 Count    Frequency (%)
mr                    266      14.9%
miss                  94       5.3%
mrs                   60       3.4%
william               38       2.1%
john                  23       1.3%
henry                 21       1.2%
master                19       1.1%
james                 14       0.8%
george                13       0.7%
richard               12       0.7%
Other values (870)    1230     68.7%
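
Word tables like these can be approximated by lower-casing each value, splitting it into words and counting them, with each frequency expressed as a share of all words. The sketch below is an assumption about the general approach, not the profiler's actual tokenizer (its regular expression would, for example, split a ticket prefix such as "a/5" differently). It uses the five Dataset A sample rows of the Name variable.

```python
import re
from collections import Counter

# The five Dataset A sample rows shown for the Name variable.
names = [
    "Wick, Mrs. George Dennick (Mary Hitchcock)",
    "Beesley, Mr. Lawrence",
    "Oreskovic, Mr. Luka",
    "Olsen, Mr. Henry Margido",
    "Toufik, Mr. Nakli",
]


def word_counts(texts: list[str]) -> Counter:
    """Lower-case each text and count runs of letters as words."""
    words = Counter()
    for text in texts:
        words.update(re.findall(r"[a-z]+", text.lower()))
    return words


counts = word_counts(names)
total = sum(counts.values())
for word, n in counts.most_common(3):
    print(f"{word}: {n} ({100 * n / total:.1f}%)")
```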

Most occurring characters (Dataset A)

Value                Count    Frequency (%)
(space)              1333     11.3%
r                    956      8.1%
e                    831      7.1%
a                    829      7.0%
i                    664      5.6%
n                    631      5.4%
s                    630      5.4%
M                    568      4.8%
l                    502      4.3%
o                    487      4.1%
Other values (48)    4329     36.8%

Most occurring characters (Dataset B)

Value                Count    Frequency (%)
(space)              1344     11.3%
r                    971      8.2%
e                    823      6.9%
a                    808      6.8%
i                    667      5.6%
s                    660      5.6%
n                    648      5.5%
M                    562      4.7%
l                    532      4.5%
o                    505      4.3%
Other values (49)    4324     36.5%

Most occurring categories, scripts and blocks

All characters fall under a single category, a single script and a single block, each reported as (unknown): 11760 characters (100.0%) in Dataset A and 11844 characters (100.0%) in Dataset B. The character frequencies per category, per script and per block are therefore identical to the Most occurring characters tables.

Sex
Categorical

                Dataset A    Dataset B
Distinct        2            2
Distinct (%)    0.4%         0.4%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value     Dataset A    Dataset B
male      300          292
female    146          154

Length

                 Dataset A    Dataset B
Max length       6            6
Median length    4            4
Mean length      4.6547085    4.690583
Min length       4            4
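
The mean length and the total character count reported for Sex follow from the category counts alone: "male" contributes 4 characters per row and "female" contributes 6. A short check under that reading, using the counts from this section:

```python
# Category counts for Sex, copied from this section.
sex_counts = {
    "Dataset A": {"male": 300, "female": 146},
    "Dataset B": {"male": 292, "female": 154},
}

for name, value_counts in sex_counts.items():
    total_chars = sum(len(value) * n for value, n in value_counts.items())
    rows = sum(value_counts.values())
    print(f"{name}: {total_chars} characters, mean length {total_chars / rows:.6f}")
```

The totals, 2076 and 2092, match the Total characters row of the Characters and Unicode table.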

Characters and Unicode

                       Dataset A    Dataset B
Total characters       2076         2092
Distinct characters    5            5
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

           Dataset A    Dataset B
1st row    female       female
2nd row    male         male
3rd row    male         male
4th row    male         female
5th row    male         female

Common Values

Value     Dataset A      Dataset B
male      300 (67.3%)    292 (65.5%)
female    146 (32.7%)    154 (34.5%)

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common-value plots for Dataset A and Dataset B; the counts shown match the Common Values table.]

Most occurring characters

Character    Dataset A      Dataset B
e            592 (28.5%)    600 (28.7%)
m            446 (21.5%)    446 (21.3%)
a            446 (21.5%)    446 (21.3%)
l            446 (21.5%)    446 (21.3%)
f            146 (7.0%)     154 (7.4%)

Most occurring categories, scripts and blocks

All characters fall under a single category, a single script and a single block, each reported as (unknown): 2076 characters (100.0%) in Dataset A and 2092 characters (100.0%) in Dataset B. The character frequencies per category, per script and per block are therefore identical to the Most occurring characters table.

Age
Real number (ℝ)

                Dataset A    Dataset B
Distinct        72           72
Distinct (%)    20.9%        19.9%
Missing         102          85
Missing (%)     22.9%        19.1%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            29.032006    29.889197
Minimum         0.42         0.75
Maximum         74           80
Zeros           0            0
Zeros (%)       0.0%         0.0%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB
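
For a variable with missing values, the reported figures are consistent with Distinct (%) and Mean being computed over the non-missing rows only, while Missing (%) is relative to all 446 rows. The sketch below checks this reading against the Age figures in this section (the Sum comes from the Descriptive statistics table); it is a consistency check, not the profiler's code.

```python
N_OBSERVATIONS = 446

# (missing, distinct, sum) for Age, copied from the tables in this section.
age = {
    "Dataset A": (102, 72, 9987.01),
    "Dataset B": (85, 72, 10790.0),
}

for name, (missing, distinct, total) in age.items():
    present = N_OBSERVATIONS - missing  # rows that actually hold an age
    print(
        f"{name}: n = {present}, "
        f"distinct = {100 * distinct / present:.1f}%, "
        f"mean = {total / present:.4f}"
    )
```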

Quantile statistics

                             Dataset A    Dataset B
Minimum                      0.42         0.75
5-th percentile              4            4
Q1                           20.75        21
median                       28           29
Q3                           36.125       38
95-th percentile             55.7         54
Maximum                      74           80
Range                        73.58        79.25
Interquartile range (IQR)    15.375       17

Descriptive statistics

                                   Dataset A        Dataset B
Standard deviation                 14.384357        14.089438
Coefficient of variation (CV)      0.4954655        0.47138899
Kurtosis                           0.30545645       0.41501417
Mean                               29.032006        29.889197
Median Absolute Deviation (MAD)    8                8
Skewness                           0.36281216       0.33798708
Sum                                9987.01          10790
Variance                           206.90974        198.51227
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=50)
Common values (Dataset A)

Value                Count    Frequency (%)
30                   18       4.0%
24                   16       3.6%
28                   13       2.9%
25                   13       2.9%
19                   12       2.7%
29                   11       2.5%
22                   11       2.5%
27                   11       2.5%
21                   11       2.5%
35                   10       2.2%
Other values (62)    218      48.9%
(Missing)            102      22.9%

Common values (Dataset B)

Value                Count    Frequency (%)
24                   20       4.5%
22                   16       3.6%
30                   14       3.1%
18                   13       2.9%
36                   13       2.9%
29                   13       2.9%
28                   12       2.7%
31                   11       2.5%
33                   11       2.5%
25                   11       2.5%
Other values (62)    227      50.9%
(Missing)            85       19.1%

Ten smallest values (Dataset A)

Value    Count    Frequency (%)
0.42     1        0.2%
0.67     1        0.2%
0.92     1        0.2%
1        5        1.1%
2        7        1.6%
3        1        0.2%
4        5        1.1%
5        2        0.4%
6        3        0.7%
7        3        0.7%

Ten smallest values (Dataset B)

Value    Count    Frequency (%)
0.75     1        0.2%
0.83     1        0.2%
0.92     1        0.2%
1        2        0.4%
2        5        1.1%
3        4        0.9%
4        5        1.1%
5        2        0.4%
6        3        0.7%
7        2        0.4%

SibSp
Real number (ℝ)

                Dataset A     Dataset B
Distinct        7             7
Distinct (%)    1.6%          1.6%
Missing         0             0
Missing (%)     0.0%          0.0%
Infinite        0             0
Infinite (%)    0.0%          0.0%
Mean            0.57174888    0.44843049
Minimum         0             0
Maximum         8             8
Zeros           312           315
Zeros (%)       70.0%         70.6%
Negative        0             0
Negative (%)    0.0%          0.0%
Memory size     7.0 KiB       7.0 KiB

Quantile statistics

                             Dataset A    Dataset B
Minimum                      0            0
5-th percentile              0            0
Q1                           0            0
median                       0            0
Q3                           1            1
95-th percentile             3            2
Maximum                      8            8
Range                        8            8
Interquartile range (IQR)    1            1

Descriptive statistics

                                   Dataset A        Dataset B
Standard deviation                 1.3078879        0.91959187
Coefficient of variation (CV)      2.2875217        2.0506899
Kurtosis                           15.965491        15.434885
Mean                               0.57174888       0.44843049
Median Absolute Deviation (MAD)    0                0
Skewness                           3.6988898        3.3135527
Sum                                255              200
Variance                           1.7105709        0.84564922
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=7)
Value counts (all seven distinct values, sorted by value)

Value    Dataset A      Dataset B
0        312 (70.0%)    315 (70.6%)
1        93 (20.9%)     97 (21.7%)
2        12 (2.7%)      16 (3.6%)
3        10 (2.2%)      7 (1.6%)
4        8 (1.8%)       8 (1.8%)
5        4 (0.9%)       2 (0.4%)
8        7 (1.6%)       1 (0.2%)

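Parch
Categorical

                Dataset A    Dataset B
Distinct        5            7
Distinct (%)    1.1%         1.6%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value counts, Dataset A: 0 (339), 1 (58), 2 (46), 4 (2), 3 (1)
Value counts, Dataset B: 0 (341), 1 (54), 2 (40), 5 (5), 4 (3), Other values (2) (3)

Length

The report gives a single column of values for this table, which matches Dataset A.

Max length       1
Median length    1
Mean length      1
Min length       1

Characters and Unicode

The report gives a single column of values for this table, which matches Dataset A.

Total characters       446
Distinct characters    5
Distinct categories    1
Distinct scripts       1
Distinct blocks        1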
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Dataset A    Dataset B
Unique        1            1
Unique (%)    0.2%         0.2%

Sample

The report gives a single column of sample values for this variable.

1st row    1
2nd row    0
3rd row    0
4th row    0
5th row    0

Common Values

Value    Dataset A      Dataset B
0        339 (76.0%)    341 (76.5%)
1        58 (13.0%)     54 (12.1%)
2        46 (10.3%)     40 (9.0%)
3        1 (0.2%)       2 (0.4%)
4        2 (0.4%)       3 (0.7%)
5        not present    5 (1.1%)
6        not present    1 (0.2%)

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common-value plots for Dataset A and Dataset B; the counts listed with the plots match the Dataset A column of the Common Values table.]

Most occurring characters

The report gives a single table here, which matches the Dataset A counts.

Character    Count    Frequency (%)
0            339      76.0%
1            58       13.0%
2            46       10.3%
4            2        0.4%
3            1        0.2%

Most occurring categories, scripts and blocks

All 446 characters (100.0%) fall under a single category, a single script and a single block, each reported as (unknown). The character frequencies per category, per script and per block are therefore identical to the Most occurring characters table.

Ticket
Text

                Dataset A    Dataset B
Distinct        377          381
Distinct (%)    84.5%        85.4%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Length

                 Dataset A    Dataset B
Max length       18           18
Median length    17           17
Mean length      6.9282511    6.8206278
Min length       3            4

Characters and Unicode

                       Dataset A    Dataset B
Total characters       3090         3042
Distinct characters    32           32
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Dataset A    Dataset B
Unique        327          333
Unique (%)    73.3%        74.7%

Sample

           Dataset A    Dataset B
1st row    36928        237671
2nd row    248698       349218
3rd row    315094       230136
4th row    C 4001       PC 17608
5th row    2641         PC 17757
Word frequencies (Dataset A)

Value                 Count    Frequency (%)
pc                    32       5.5%
ca                    12       2.1%
a/5                   8        1.4%
c.a                   8        1.4%
ston/o                7        1.2%
2                     7        1.2%
2343                  7        1.2%
sc/paris              7        1.2%
soton/oq              6        1.0%
w./c                  6        1.0%
Other values (397)    479      82.7%

Word frequencies (Dataset B)

Value                 Count    Frequency (%)
pc                    31       5.5%
c.a                   13       2.3%
a/5                   8        1.4%
2                     7        1.2%
ston/o                7        1.2%
347082                5        0.9%
ca                    5        0.9%
sc/paris              5        0.9%
soton/oq              4        0.7%
113760                4        0.7%
Other values (403)    478      84.3%

Most occurring characters (Dataset A)

Value                Count    Frequency (%)
3                    388      12.6%
1                    333      10.8%
2                    292      9.4%
7                    239      7.7%
4                    237      7.7%
6                    214      6.9%
0                    213      6.9%
5                    184      6.0%
9                    156      5.0%
8                    145      4.7%
Other values (22)    689      22.3%

Most occurring characters (Dataset B)

Value                Count    Frequency (%)
3                    391      12.9%
1                    357      11.7%
2                    309      10.2%
7                    238      7.8%
4                    232      7.6%
0                    212      7.0%
6                    209      6.9%
5                    189      6.2%
9                    160      5.3%
8                    140      4.6%
Other values (22)    605      19.9%

Most occurring categories, scripts and blocks

All characters fall under a single category, a single script and a single block, each reported as (unknown): 3090 characters (100.0%) in Dataset A and 3042 characters (100.0%) in Dataset B. The character frequencies per category, per script and per block are therefore identical to the Most occurring characters tables.

Fare
Real number (ℝ)

                Dataset A    Dataset B
Distinct        170          174
Distinct (%)    38.1%        39.0%
Missing         0            0
Missing (%)     0.0%         0.0%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            33.246851    33.743067
Minimum         0            0
Maximum         512.3292     512.3292
Zeros           7            9
Zeros (%)       1.6%         2.0%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Quantile statistics

                             Dataset A    Dataset B
Minimum                      0            0
5-th percentile              7.225        7.225
Q1                           7.8958       7.8958
median                       14.2         13.5
Q3                           31.20625     30.5
95-th percentile             130.2375     133.65
Maximum                      512.3292     512.3292
Range                        512.3292     512.3292
Interquartile range (IQR)    23.31045     22.6042

Descriptive statistics

                                   Dataset A        Dataset B
Standard deviation                 52.880301        57.79157
Coefficient of variation (CV)      1.5905356        1.7126947
Kurtosis                           32.867076        32.917728
Mean                               33.246851        33.743067
Median Absolute Deviation (MAD)    6.95             6.25
Skewness                           4.7997987        4.9797293
Sum                                14828.096        15049.408
Variance                           2796.3262        3339.8656
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=50)
Common values (Dataset A)

Value                 Count    Frequency (%)
7.75                  24       5.4%
13                    22       4.9%
8.05                  22       4.9%
26                    17       3.8%
7.8958                15       3.4%
7.2292                11       2.5%
7.925                 11       2.5%
26.55                 8        1.8%
10.5                  8        1.8%
7.225                 7        1.6%
Other values (160)    301      67.5%

Common values (Dataset B)

Value                 Count    Frequency (%)
7.8958                26       5.8%
8.05                  21       4.7%
13                    18       4.0%
7.75                  16       3.6%
26                    16       3.6%
10.5                  15       3.4%
7.925                 12       2.7%
0                     9        2.0%
7.25                  7        1.6%
7.225                 7        1.6%
Other values (164)    299      67.0%

Ten smallest values (Dataset A)

Value     Count    Frequency (%)
0         7        1.6%
5         1        0.2%
6.4375    1        0.2%
6.4958    1        0.2%
6.75      2        0.4%
6.975     1        0.2%
7.0458    1        0.2%
7.05      3        0.7%
7.0542    1        0.2%
7.125     3        0.7%

Ten smallest values (Dataset B)

Value     Count    Frequency (%)
0         9        2.0%
6.2375    1        0.2%
6.4958    2        0.4%
6.75      1        0.2%
6.975     1        0.2%
7.0458    1        0.2%
7.05      3        0.7%
7.0542    1        0.2%
7.125     2        0.4%
7.225     7        1.6%

Cabin
Text

                Dataset A    Dataset B
Distinct        78           79
Distinct (%)    82.1%        83.2%
Missing         351          351
Missing (%)     78.7%        78.7%
Memory size     7.0 KiB      7.0 KiB

Length

                 Dataset A    Dataset B
Max length       11           15
Median length    3            3
Mean length      3.7052632    3.6315789
Min length       1            1

Characters and Unicode

                       Dataset A    Dataset B
Total characters       352          345
Distinct characters    19           18
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Dataset A    Dataset B
Unique        63           65
Unique (%)    66.3%        68.4%

Sample

           Dataset A    Dataset B
1st row    D56          F4
2nd row    A31          B57 B59 B63 B66
3rd row    B94          E67
4th row    B96 B98      B86
5th row    C22 C26      C101
Word frequencies (Dataset A)

Value                Count    Frequency (%)
b96                  3        2.7%
b98                  3        2.7%
c23                  3        2.7%
c25                  3        2.7%
c27                  3        2.7%
b51                  2        1.8%
b53                  2        1.8%
b55                  2        1.8%
b22                  2        1.8%
c123                 2        1.8%
Other values (77)    88       77.9%

Word frequencies (Dataset B)

Value                Count    Frequency (%)
b96                  4        3.6%
b98                  4        3.6%
b49                  2        1.8%
e67                  2        1.8%
d26                  2        1.8%
f2                   2        1.8%
c83                  2        1.8%
e44                  2        1.8%
b77                  2        1.8%
c124                 2        1.8%
Other values (79)    87       78.4%

Most occurring characters (Dataset A)

Value               Count    Frequency (%)
2                   43       12.2%
C                   43       12.2%
B                   31       8.8%
1                   30       8.5%
5                   29       8.2%
3                   24       6.8%
7                   19       5.4%
(space)             18       5.1%
4                   18       5.1%
8                   17       4.8%
Other values (9)    80       22.7%

Most occurring characters (Dataset B)

Value               Count    Frequency (%)
C                   35       10.1%
B                   34       9.9%
2                   31       9.0%
1                   29       8.4%
6                   24       7.0%
5                   22       6.4%
9                   20       5.8%
3                   20       5.8%
4                   20       5.8%
8                   19       5.5%
Other values (8)    91       26.4%

Most occurring categories, scripts and blocks

All characters fall under a single category, a single script and a single block, each reported as (unknown): 352 characters (100.0%) in Dataset A and 345 characters (100.0%) in Dataset B. The character frequencies per category, per script and per block are therefore identical to the Most occurring characters tables.

Embarked
Categorical

                Dataset A    Dataset B
Distinct        3            3
Distinct (%)    0.7%         0.7%
Missing         0            1
Missing (%)     0.0%         0.2%
Memory size     7.0 KiB      7.0 KiB

Value    Dataset A    Dataset B
S        316          332
C        80           78
Q        50           35

Length

                 Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

                       Dataset A    Dataset B
Total characters       446          445
Distinct characters    3            3
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

           Dataset A    Dataset B
1st row    S            S
2nd row    S            S
3rd row    S            S
4th row    S            C
5th row    C            C

Common Values

Value        Dataset A      Dataset B
S            316 (70.9%)    332 (74.4%)
C            80 (17.9%)     78 (17.5%)
Q            50 (11.2%)     35 (7.8%)
(Missing)    not present    1 (0.2%)
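
For Dataset B, the Common Values percentages (74.4% for S, 7.8% for Q) differ slightly from the 74.6% and 7.9% shown in the other Embarked tables. The difference is consistent with the denominator: Common Values counts the one missing row, while the character-based tables only see the 445 non-missing values. A short check under that assumption:

```python
# Embarked counts for Dataset B, copied from the Common Values table.
counts = {"S": 332, "C": 78, "Q": 35}
missing = 1

present = sum(counts.values())  # rows that hold a value
all_rows = present + missing    # every row, missing included

for value, n in counts.items():
    print(
        f"{value}: {100 * n / all_rows:.1f}% of all rows, "
        f"{100 * n / present:.1f}% of non-missing rows"
    )
```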

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common-value plots for Dataset A and Dataset B, accompanied by the values below.]

Value    Dataset A      Dataset B
s        316 (70.9%)    332 (74.6%)
c        80 (17.9%)     78 (17.5%)
q        50 (11.2%)     35 (7.9%)

Most occurring characters

Character    Dataset A      Dataset B
S            316 (70.9%)    332 (74.6%)
C            80 (17.9%)     78 (17.5%)
Q            50 (11.2%)     35 (7.9%)

Most occurring categories, scripts and blocks

All characters fall under a single category, a single script and a single block, each reported as (unknown): 446 characters (100.0%) in Dataset A and 445 characters (100.0%) in Dataset B. The character frequencies per category, per script and per block are therefore identical to the Most occurring characters table.

Interactions

Dataset A

[Figures: 16 interaction plots. Interaction plot not present for dataset in 9 of the 25 panels.]

Dataset B

[Figures: 25 interaction plots, one per panel.]

Correlations

Dataset A

[Figure: correlation heatmap]

Dataset B

[Figure: correlation heatmap]

Dataset A

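Variable | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived
Age | 1.000 | 0.000 | 0.132 | 0.302 | 0.004 | 0.285 | 0.055 | -0.186 | 0.148
Embarked | 0.000 | 1.000 | 0.138 | 0.074 | 0.024 | 0.238 | 0.091 | 0.073 | 0.129
Fare | 0.132 | 0.138 | 1.000 | 0.221 | -0.051 | 0.450 | 0.148 | 0.469 | 0.203
Parch | 0.302 | 0.074 | 0.221 | 1.000 | 0.000 | 0.000 | 0.221 | 0.312 | 0.125
PassengerId | 0.004 | 0.024 | -0.051 | 0.000 | 1.000 | 0.039 | 0.081 | -0.062 | 0.091
Pclass | 0.285 | 0.238 | 0.450 | 0.000 | 0.039 | 1.000 | 0.025 | 0.172 | 0.298
Sex | 0.055 | 0.091 | 0.148 | 0.221 | 0.081 | 0.025 | 1.000 | 0.173 | 0.508
SibSp | -0.186 | 0.073 | 0.469 | 0.312 | -0.062 | 0.172 | 0.173 | 1.000 | 0.148
Survived | 0.148 | 0.129 | 0.203 | 0.125 | 0.091 | 0.298 | 0.508 | 0.148 | 1.000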

Dataset B

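Variable | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived
Age | 1.000 | 0.000 | 0.113 | -0.249 | 0.085 | 0.217 | 0.078 | -0.173 | 0.227
Embarked | 0.000 | 1.000 | 0.182 | 0.000 | 0.000 | 0.244 | 0.203 | 0.000 | 0.145
Fare | 0.113 | 0.182 | 1.000 | 0.438 | -0.030 | 0.493 | 0.103 | 0.446 | 0.270
Parch | -0.249 | 0.000 | 0.438 | 1.000 | -0.017 | 0.066 | 0.241 | 0.403 | 0.180
PassengerId | 0.085 | 0.000 | -0.030 | -0.017 | 1.000 | 0.000 | 0.126 | -0.089 | 0.136
Pclass | 0.217 | 0.244 | 0.493 | 0.066 | 0.000 | 1.000 | 0.000 | 0.137 | 0.279
Sex | 0.078 | 0.203 | 0.103 | 0.241 | 0.126 | 0.000 | 1.000 | 0.184 | 0.535
SibSp | -0.173 | 0.000 | 0.446 | 0.403 | -0.089 | 0.137 | 0.184 | 1.000 | 0.091
Survived | 0.227 | 0.145 | 0.270 | 0.180 | 0.136 | 0.279 | 0.535 | 0.091 | 1.000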
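
One way to compare the two correlation tables is to rank variable pairs by how much their coefficient differs between Dataset A and Dataset B. The sketch below is illustrative only: it uses just the Fare, Parch and Survived entries copied from the tables above, and the helper name `largest_shifts` is hypothetical rather than part of ydata-profiling.

```python
import pandas as pd

cols = ["Fare", "Parch", "Survived"]

# Sub-matrices copied from the Dataset A and Dataset B correlation tables
corr_a = pd.DataFrame(
    [[1.000, 0.221, 0.203],
     [0.221, 1.000, 0.125],
     [0.203, 0.125, 1.000]],
    index=cols, columns=cols,
)
corr_b = pd.DataFrame(
    [[1.000, 0.438, 0.270],
     [0.438, 1.000, 0.180],
     [0.270, 0.180, 1.000]],
    index=cols, columns=cols,
)


def largest_shifts(a, b, top=3):
    """Return the variable pairs whose coefficient differs most between a and b."""
    diff = (b - a).abs()
    # Visit each unordered pair once (upper triangle, diagonal excluded)
    pairs = [
        (diff.iloc[i, j], diff.index[i], diff.columns[j])
        for i in range(len(diff))
        for j in range(i + 1, len(diff))
    ]
    pairs.sort(reverse=True)
    return [(row, col, round(float(d), 3)) for d, row, col in pairs[:top]]


shifts = largest_shifts(corr_a, corr_b)
print(shifts)
```

In this subset the Fare/Parch coefficient moves the most (0.221 in Dataset A, 0.438 in Dataset B).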

Missing values

Dataset A

[Figure: nullity count by column]
A simple visualization of nullity by column.

Dataset B

[Figure: nullity count by column]
A simple visualization of nullity by column.

Dataset A

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Dataset B

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Dataset A

[Figure: nullity correlation heatmap]
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Dataset B

[Figure: nullity correlation heatmap]
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
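
The nullity correlation described in the heatmap captions can be approximated with plain pandas: turn each column into a missing/present indicator and correlate the indicators. This is a minimal sketch under stated assumptions, not necessarily how the report computes its heatmap. The function name `nullity_correlation` and the toy values are hypothetical, and columns with no variation in missingness are dropped because their correlation is undefined.

```python
import numpy as np
import pandas as pd


def nullity_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between per-column missingness indicators."""
    # 1 where a value is missing, 0 where it is present
    is_missing = df.isnull().astype(int)
    # Columns that are always present (or always missing) have zero variance,
    # so their correlation is undefined; keep only columns that vary.
    varying = is_missing.loc[:, is_missing.nunique() > 1]
    return varying.corr()


# Hypothetical toy frame; the values are made up for illustration
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan, 35.0, 54.0],
    "Cabin": ["C85", np.nan, np.nan, np.nan, "E46", np.nan],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 51.86],
})

corr = nullity_correlation(toy)
print(corr.round(3))
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means missingness in one says little about the other.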

Sample

Dataset A

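(index) | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
856 | 857 | 1 | 1 | Wick, Mrs. George Dennick (Mary Hitchcock) | female | 45.0 | 1 | 1 | 36928 | 164.8667 | NaN | S
21 | 22 | 1 | 2 | Beesley, Mr. Lawrence | male | 34.0 | 0 | 0 | 248698 | 13.0000 | D56 | S
725 | 726 | 0 | 3 | Oreskovic, Mr. Luka | male | 20.0 | 0 | 0 | 315094 | 8.6625 | NaN | S
508 | 509 | 0 | 3 | Olsen, Mr. Henry Margido | male | 28.0 | 0 | 0 | C 4001 | 22.5250 | NaN | S
531 | 532 | 0 | 3 | Toufik, Mr. Nakli | male | NaN | 0 | 0 | 2641 | 7.2292 | NaN | C
854 | 855 | 0 | 2 | Carter, Mrs. Ernest Courtenay (Lilian Hughes) | female | 44.0 | 1 | 0 | 244252 | 26.0000 | NaN | S
476 | 477 | 0 | 2 | Renouf, Mr. Peter Henry | male | 34.0 | 1 | 0 | 31027 | 21.0000 | NaN | S
209 | 210 | 1 | 1 | Blank, Mr. Henry | male | 40.0 | 0 | 0 | 112277 | 31.0000 | A31 | C
263 | 264 | 0 | 1 | Harrison, Mr. William | male | 40.0 | 0 | 0 | 112059 | 0.0000 | B94 | S
532 | 533 | 0 | 3 | Elias, Mr. Joseph Jr | male | 17.0 | 1 | 1 | 2690 | 7.2292 | NaN | C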

Dataset B

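(index) | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
357 | 358 | 0 | 2 | Funk, Miss. Annie Clemmer | female | 38.0 | 0 | 0 | 237671 | 13.0000 | NaN | S
739 | 740 | 0 | 3 | Nankoff, Mr. Minko | male | NaN | 0 | 0 | 349218 | 7.8958 | NaN | S
183 | 184 | 1 | 2 | Becker, Master. Richard F | male | 1.0 | 2 | 1 | 230136 | 39.0000 | F4 | S
311 | 312 | 1 | 1 | Ryerson, Miss. Emily Borie | female | 18.0 | 2 | 2 | PC 17608 | 262.3750 | B57 B59 B63 B66 | C
380 | 381 | 1 | 1 | Bidois, Miss. Rosalie | female | 42.0 | 0 | 0 | PC 17757 | 227.5250 | NaN | C
144 | 145 | 0 | 2 | Andrew, Mr. Edgardo Samuel | male | 18.0 | 0 | 0 | 231945 | 11.5000 | NaN | S
568 | 569 | 0 | 3 | Doharr, Mr. Tannous | male | NaN | 0 | 0 | 2686 | 7.2292 | NaN | C
367 | 368 | 1 | 3 | Moussa, Mrs. (Mantoura Boulos) | female | NaN | 0 | 0 | 2626 | 7.2292 | NaN | C
135 | 136 | 0 | 2 | Richard, Mr. Emile | male | 23.0 | 0 | 0 | SC/PARIS 2133 | 15.0458 | NaN | C
32 | 33 | 1 | 3 | Glynn, Miss. Mary Agatha | female | NaN | 0 | 0 | 335677 | 7.7500 | NaN | Q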

Dataset A

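(index) | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
413 | 414 | 0 | 2 | Cunningham, Mr. Alfred Fleming | male | NaN | 0 | 0 | 239853 | 0.0000 | NaN | S
392 | 393 | 0 | 3 | Gustafsson, Mr. Johan Birger | male | 28.0 | 2 | 0 | 3101277 | 7.9250 | NaN | S
442 | 443 | 0 | 3 | Petterson, Mr. Johan Emil | male | 25.0 | 1 | 0 | 347076 | 7.7750 | NaN | S
816 | 817 | 0 | 3 | Heininen, Miss. Wendla Maria | female | 23.0 | 0 | 0 | STON/O2. 3101290 | 7.9250 | NaN | S
397 | 398 | 0 | 2 | McKane, Mr. Peter David | male | 46.0 | 0 | 0 | 28403 | 26.0000 | NaN | S
102 | 103 | 0 | 1 | White, Mr. Richard Frasar | male | 21.0 | 0 | 1 | 35281 | 77.2875 | D26 | S
884 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | male | 25.0 | 0 | 0 | SOTON/OQ 392076 | 7.0500 | NaN | S
280 | 281 | 0 | 3 | Duane, Mr. Frank | male | 65.0 | 0 | 0 | 336439 | 7.7500 | NaN | Q
456 | 457 | 0 | 1 | Millet, Mr. Francis Davis | male | 65.0 | 0 | 0 | 13509 | 26.5500 | E38 | S
479 | 480 | 1 | 3 | Hirvonen, Miss. Hildur E | female | 2.0 | 0 | 1 | 3101298 | 12.2875 | NaN | S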

Dataset B

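(index) | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
454 | 455 | 0 | 3 | Peduzzi, Mr. Joseph | male | NaN | 0 | 0 | A/5 2817 | 8.0500 | NaN | S
149 | 150 | 0 | 2 | Byles, Rev. Thomas Roussel Davids | male | 42.0 | 0 | 0 | 244310 | 13.0000 | NaN | S
198 | 199 | 1 | 3 | Madigan, Miss. Margaret "Maggie" | female | NaN | 0 | 0 | 370370 | 7.7500 | NaN | Q
230 | 231 | 1 | 1 | Harris, Mrs. Henry Birkhardt (Irene Wallach) | female | 35.0 | 1 | 0 | 36973 | 83.4750 | C83 | S
324 | 325 | 0 | 3 | Sage, Mr. George John Jr | male | NaN | 8 | 2 | CA. 2343 | 69.5500 | NaN | S
248 | 249 | 1 | 1 | Beckwith, Mr. Richard Leonard | male | 37.0 | 1 | 1 | 11751 | 52.5542 | D35 | S
381 | 382 | 1 | 3 | Nakid, Miss. Maria ("Mary") | female | 1.0 | 0 | 2 | 2653 | 15.7417 | NaN | C
539 | 540 | 1 | 1 | Frolicher, Miss. Hedwig Margaritha | female | 22.0 | 0 | 2 | 13568 | 49.5000 | B39 | C
118 | 119 | 0 | 1 | Baxter, Mr. Quigg Edmond | male | 24.0 | 0 | 1 | PC 17558 | 247.5208 | B58 B60 | C
194 | 195 | 1 | 1 | Brown, Mrs. James Joseph (Margaret Tobin) | female | 44.0 | 0 | 0 | PC 17610 | 27.7208 | B4 | C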

Duplicate rows

Dataset A

Dataset does not contain duplicate rows.

Dataset B

Dataset does not contain duplicate rows.
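
A duplicate-row check of the kind summarized above can be reproduced with pandas. The sketch below is an assumption about one reasonable way to count duplicates (every repeat after the first occurrence counts as one duplicate). The report's own counting convention may differ, and the helper name `duplicate_summary` and the toy data are hypothetical.

```python
import pandas as pd


def duplicate_summary(df: pd.DataFrame) -> tuple[int, float]:
    """Return (number of duplicate rows, percentage of all rows)."""
    # True for every row that repeats an earlier row across all columns
    dup_mask = df.duplicated(keep="first")
    n_dup = int(dup_mask.sum())
    pct = 100.0 * n_dup / len(df) if len(df) else 0.0
    return n_dup, pct


# Hypothetical toy frame: the second and third rows are identical
toy = pd.DataFrame({
    "PassengerId": [1, 2, 2, 3],
    "Sex": ["male", "female", "female", "male"],
})

n_dup, pct = duplicate_summary(toy)
print(f"{n_dup} duplicate rows ({pct:.1f}%)")
```

A frame in which some column has all-unique values, as PassengerId does in both datasets, cannot contain fully duplicated rows, which is consistent with the result reported above.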